Interindividual variation refuses to go away: a Bayesian computer model of language change in communicative networks
Treating speech communities as homogeneous entities is not an accurate representation of reality, as it misses some of the complexities of linguistic interactions. Interindividual variation and multiple types of biases are ubiquitous in speech communities, regardless of their size. This variation is often neglected due to the assumption that 'majority rules,' and that the emerging language of the community will override any such biases by forcing individuals to overcome their own biases, or risk having their use of language treated as 'idiosyncratic' or outright 'pathological.' In this paper, we use computer simulations of Bayesian linguistic agents embedded in communicative networks to investigate how biased individuals, representing a minority of the population, interact with the unbiased majority, how a shared language emerges, and the dynamics of these biases across time. We tested different network sizes (from very small to very large) and types (random, scale free, and small world), along with different strengths and types of bias (modeled through the Bayesian prior distribution of the agents and the mechanism used for generating utterances: either sampling from the posterior distribution ['sampler'] or picking the value with the maximum probability ['MAP']). The results show that, while the biased agents, even when in the minority, do adapt their language by going against their a priori preferences, they are far from being swamped by the majority, and instead the emergent shared language of the whole community is influenced by their bias
Extraction automatique de paramètres prosodiques pour l'identification automatique des langues [Automatic extraction of prosodic parameters for automatic language identification]
The aim of this study is to propose a new approach to Automatic Language Identification: it is based on rhythmic modelling and fundamental frequency modelling and does not require any hand-labelled data. First we need to investigate how prosodic or rhythmic information can be taken into account for Automatic Language Identification. A new automatically extracted unit, the pseudo-syllable, is introduced. Rhythmic and intonative features are then automatically extracted from this unit. Elementary decision modules are defined with Gaussian mixture models. These prosodic models are combined with a more classical approach, acoustic modelling of the vowel system. Experiments are conducted on the five European languages of the MULTEXT corpus: English, French, German, Italian and Spanish. The relevance of the rhythmic parameters and the efficiency of each system (rhythmic model, fundamental frequency model and vowel system model) are evaluated. The influence of these approaches on the performance of automatic language identification systems is addressed. We obtain 91% correct identification with 21 s utterances using all the information sources
Rhythmic unit extraction and modelling for automatic language identification
This paper deals with an approach to Automatic Language Identification based on rhythmic modelling. Besides phonetics and phonotactics, rhythm is one of the most promising features to consider for language identification, even if its extraction and modelling are not straightforward. One of the main problems to address is what to model. In this paper, a rhythm extraction algorithm is described: using a vowel detection algorithm, rhythmic units related to syllables are segmented. Several parameters are extracted (consonantal and vowel duration, cluster complexity) and modelled with a Gaussian mixture. Experiments are performed on read speech for 7 languages (English, French, German, Italian, Japanese, Mandarin and Spanish), and results reach up to 86 ± 6% correct discrimination between stress-timed, mora-timed and syllable-timed classes of languages, and 67 ± 8% correct language identification on average for the 7 languages with utterances of 21 seconds. These results are discussed and compared with those obtained with a standard acoustic Gaussian mixture modelling approach (88 ± 5% correct identification for the 7-language identification task)
Merging Segmental And Rhythmic Features For Automatic Language Identification
Speech Technologies for African Languages: Example of a Multilingual Calculator for Education
This paper presents our achievements after 18 months of the ALFFA project, which deals with African language technologies. We focus on a multilingual calculator (Android app) that will be demonstrated during the Show and Tell session
Projet RAIVES (Recherche Automatique d'Informations Verbales Et Sonores) vers l'extraction et la structuration de données radiophoniques sur Internet [RAIVES project: towards the extraction and structuring of radio broadcast data on the Internet]
Contract report. The Internet has become an important vector of communication. It enables the dissemination and exchange of a growing volume of data. The task is therefore no longer merely to collect large masses of "electronic information", but above all to catalogue and classify it so as to ease access to the useful information. A piece of information, however important, that sits on an unlisted site goes unnoticed. The share of the "invisible Web" must therefore not be neglected. The invisible Web can be defined as the set of non-indexed information, either because it is not catalogued, or because the pages containing it are dynamic, or because its nature makes it impossible or difficult to index. Indeed, most search engines rely on a textual analysis of page content and cannot take into account the content of audio or visual documents. A set of content descriptors must therefore be provided to structure the documents so that the information is accessible to search engines. For audio documents, the goal of our project is thus, on the one hand, to extract this information and, on the other hand, to provide a structuring of the documents in order to ease access to the content. Content-based indexing of audio documents relies on techniques used in automatic speech processing, but must be distinguished from the automatic alignment of a text with an audio stream or from automatic speech recognition. Either of these would reduce the content of an audio document to its verbal component alone. Yet the non-verbal component of an audio document is important and often corresponds to a particular structuring of the document. For example, in radio broadcasts, speech alternates with music, and in particular with jingles, to announce the news.
Thus, we can consider a set of content descriptors for a radio document: Speech/Music segments, "key sounds", language, speaker changes together with a possible identification of these speakers, keywords and topics. This set can of course be enriched. Extracting the set of descriptors is probably sufficient to reference a document on the Internet. But it is worthwhile to go further and give access to precise parts of the document. Each descriptor must be associated with a time marker that gives direct access to the information. However, since the descriptors belong to different levels of description, their organization is not linear in time: the same speaker may speak in two languages within a single speech segment, or several speakers may intervene within a speech segment in a given language. It must therefore also be possible to provide a structuring of the information on different levels of representation
Ville et campagne de Fréjus romaine [Town and countryside of Roman Fréjus]
In 2006, a preventive archaeology excavation, known as "Villa Romana", was carried out in the Villeneuve district of Fréjus. In Antiquity this was a peri-urban area located between the town of Forum Iulii and the mouth of the Argens. Long known for a bath building that is still standing, the district has been excavated on several occasions and is interpreted as the site of the fleet's camp, established after the battle of Actium. The camp gradually turned into a suburban district during the 1st century AD as Forum Iulii developed. The excavated sector lies in the southern part of the camp, which was bordered by the sea in the early part of Antiquity. The excavation revealed a developed beach. The land was then quickly gained from the sea, owing to a rapid advance of the shoreline that recent studies have documented well at Fréjus. Gardens were then laid out there. From the 2nd century onwards, this space was turned into an agricultural zone, and it illustrates the exploitation of the countryside at the gates of Fréjus until the end of Antiquity. A long period of abandonment lasting several centuries followed, before the area was again devoted to agriculture, until the start of the 1960s. Since then, the growth of the present-day town of Fréjus has once more turned this district into an urban zone. This volume, published only a few years after the excavation, presents all the archaeological and palaeoenvironmental studies carried out during this largely multidisciplinary operation. They provide a new environmental context for this ancient district and make it possible to redefine a landscape on the basis of recent bioarchaeological and palaeoecological analyses.
The study of all the archaeological finds is also presented, following the chronology and evolution of this district through Antiquity and the modern period
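Les droits disciplinaires des fonctions publiques : « unification », « harmonisation » ou « distanciation ». A propos de la loi du 26 avril 2016 relative à la déontologie et aux droits et obligations des fonctionnaires [Disciplinary law in the civil services: "unification", "harmonisation" or "distancing". On the law of 26 April 2016 on ethics and the rights and obligations of civil servants]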
The production of $t\bar{t}$, $W+b\bar{b}$ and $W+c\bar{c}$ is studied in the forward region of proton-proton collisions collected at a centre-of-mass energy of 8 TeV by the LHCb experiment, corresponding to an integrated luminosity of $1.98 \pm 0.02\ \mathrm{fb}^{-1}$. The $W$ bosons are reconstructed in the decays $W \to \ell\nu$, where $\ell$ denotes muon or electron, while the $b$ and $c$ quarks are reconstructed as jets. All measured cross-sections are in agreement with next-to-leading-order Standard Model predictions
The MAORY first-light adaptive optics module for E-ELT
The MAORY adaptive optics module is part of the first light instrumentation suite for the E-ELT. The MAORY project phase B is going to start soon. This paper contains a system-level overview of the current instrument design
Observation of the $B^0 \to \rho^0\rho^0$ decay from an amplitude analysis of $B^0 \to (\pi^+\pi^-)(\pi^+\pi^-)$ decays
Proton-proton collision data recorded in 2011 and 2012 by the LHCb experiment, corresponding to an integrated luminosity of $3.0\ \mathrm{fb}^{-1}$, are analysed to search for the charmless $B^0 \to \rho^0\rho^0$ decay. More than 600 $B^0 \to (\pi^+\pi^-)(\pi^+\pi^-)$ signal decays are selected and used to perform an amplitude analysis, under the assumption of no CP violation in the decay, from which the $B^0 \to \rho^0\rho^0$ decay is observed for the first time with 7.1 standard deviations significance. The fraction of $B^0 \to \rho^0\rho^0$ decays yielding a longitudinally polarised final state is measured to be $f_L = 0.745^{+0.048}_{-0.058}\,(\mathrm{stat}) \pm 0.034\,(\mathrm{syst})$. The $B^0 \to \rho^0\rho^0$ branching fraction, using the $B^0 \to \phi K^*(892)^0$ decay as reference, is also reported as $\mathcal{B}(B^0 \to \rho^0\rho^0) = (0.94 \pm 0.17\,(\mathrm{stat}) \pm 0.09\,(\mathrm{syst}) \pm 0.06\,(\mathrm{BF})) \times 10^{-6}$